We compared several spatial transcriptomics (ST) analyses of the HER2-positive breast tumors provided in Andersson et al. (2021). The data consist of 7 patients, each with 3-6 slices. So far, this analysis covers only patient ‘A’. The following analyses were performed:
BayesTME applied to each slice. We enforce 6 cell types and a lambda of 10000. We draw 1000 samples, with 2000 burn-in steps and a thinning of 5.
Latent Dirichlet Allocation (LDA) applied to each slice, based on sklearn.decomposition.LatentDirichletAllocation. We enforce 6 cell types.
LDA applied to each patient across all slices. Again, we enforce 6 cell types.
The baseline analysis performed by Andersson et al. (2021). K-nearest neighbors clustering is performed on gene expression at the patient+cohort level. This was used to set the number of cell types in each slice.
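As a rough illustration of the two LDA variants listed above, the sketch below fits sklearn.decomposition.LatentDirichletAllocation once per slice and once on all slices pooled. The count matrices are synthetic stand-ins (not the Andersson et al. data), and the helper name fit_lda, the random seed, and the matrix sizes are assumptions made for the example only. All other LDA settings are left at the sklearn defaults, which may differ from what was used in the actual analysis.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

N_CELL_TYPES = 6  # number of topics, matching the 6 cell types enforced above


def fit_lda(counts, n_topics=N_CELL_TYPES, seed=0):
    """Fit LDA to a (spots x genes) matrix of non-negative counts.

    Hypothetical helper for illustration. Returns:
      spot_probs: (spots x topics) array, each row sums to 1
      gene_probs: (topics x genes) array, each row sums to 1
    """
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    spot_probs = lda.fit_transform(counts)
    # components_ holds unnormalized topic-gene weights; normalize each topic.
    gene_probs = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
    return spot_probs, gene_probs


# Synthetic stand-in for one patient: 3 slices sharing the same 50 genes.
rng = np.random.default_rng(0)
slices = [rng.poisson(2.0, size=(n_spots, 50)) for n_spots in (40, 35, 45)]

# Slice-based LDA: one independent model per slice.
per_slice = [fit_lda(counts) for counts in slices]

# Patient-based LDA: one model on all spots, then split back by slice.
pooled_spot_probs, pooled_gene_probs = fit_lda(np.vstack(slices))
boundaries = np.cumsum([counts.shape[0] for counts in slices])[:-1]
pooled_by_slice = np.split(pooled_spot_probs, boundaries)
```

Note that topics fitted independently on different slices are not aligned with each other, whereas the pooled model shares one set of topics across all slices of the patient.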
This set of figures shows the spot probability by cell type for each of the 4 methods.
This set of figures considers only BayesTME and the slice-based LDA. For each BayesTME cell type, we identify the 5 top-ranked marker genes. We then examine the gene-topic probabilities from the slice-based LDA for those genes, to assess whether the two methods recover similar gene-latent space relationships.
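The marker-gene comparison can be sketched as follows. This is a minimal example under stated assumptions, not the procedure used to produce the figures: it assumes marker genes are ranked by a per-cell-type score matrix (the actual BayesTME marker criterion is not specified here), and the function names top_marker_genes and marker_topic_table, along with the toy arrays, are hypothetical.

```python
import numpy as np


def top_marker_genes(signatures, n_markers=5):
    """Indices of the n_markers highest-scoring genes for each cell type.

    signatures: (cell_types x genes) array of per-gene scores
    (assumed ranking criterion, for illustration only).
    """
    order = np.argsort(signatures, axis=1)[:, ::-1]  # descending per row
    return order[:, :n_markers]


def marker_topic_table(markers, gene_topic_probs):
    """Mean LDA gene-topic probability over each cell type's marker genes.

    markers: (cell_types x n_markers) gene indices
    gene_topic_probs: (topics x genes) array, rows sum to 1
    Returns a (cell_types x topics) array.
    """
    # Fancy indexing gives shape (topics, cell_types, n_markers).
    return gene_topic_probs[:, markers].mean(axis=2).T


# Toy example: 2 cell types, 2 topics, 4 genes, 2 markers per cell type.
signatures = np.array([[0.1, 0.9, 0.5, 0.0],
                       [0.7, 0.2, 0.1, 0.6]])
gene_topic_probs = np.array([[0.4, 0.1, 0.1, 0.4],
                             [0.1, 0.5, 0.3, 0.1]])

markers = top_marker_genes(signatures, n_markers=2)
table = marker_topic_table(markers, gene_topic_probs)
best_topic = table.argmax(axis=1)  # LDA topic most associated with each cell type
```

If each cell type's markers concentrate their probability in a single LDA topic, the two methods are assigning those genes to comparable latent components; a table with no clear maximum per row would suggest they disagree.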